Abstract

We present a new method for automatically providing feedback for
introductory programming problems. To use this method, we need a
reference implementation of the assignment and an error model
consisting of potential corrections to errors that students might
make. Using this information, the system automatically derives
minimal corrections to students' incorrect solutions, providing them
with a quantifiable measure of exactly how incorrect a given solution
was, as well as feedback about what they did wrong.

We introduce a simple language for describing error models 
in terms of correction rules, and formally define a rule-directed 
translation strategy that reduces the problem of finding minimal 
corrections in an incorrect program to the problem of synthesizing 
a correct program from a sketch. We have evaluated our system on 
thousands of real student attempts obtained from 6.00 and 6.00x. 
Our results show that relatively simple error models can correct on 
average 65% of all incorrect submissions. 

1. Introduction 

There has been a lot of interest recently in making quality edu- 
cation more accessible to students worldwide using information 
technology. Several education initiatives such as EdX, Coursera, 
and Udacity are teaming up with experts to provide online courses 
on various college-level subjects ranging from computer science to 
physics and psychology. These courses, also called massive open
online courses (MOOCs), are typically taken by several thousand
students worldwide and present many interesting challenges
that are not present in a traditional classroom setting consisting of
only a few hundred students. One such challenge in these courses
is to provide personalized feedback on practice exercises and as- 
signments to a large number of students. We consider the problem 
of providing automated feedback for online introductory program- 
ming courses in this paper. We envision this technology to be useful 
in a traditional classroom setting as well. 

The two most commonly used methods today for providing feedback on
programming problems are (i) test-case-based feedback and (ii)
peer feedback [7]. In automated test-case-based feedback, the student
program is run on a set of test cases and the failing test cases are
reported back to the student. This is also how the 6.00x course
(Introduction to Computer Science and Programming) offered by MITx
currently provides feedback for its Python programming exercises.
Reporting failing test cases is, however, not ideal, especially for
beginner programmers, who find it difficult to map the failing test
cases to errors in their code. We found many students posting their
submissions on the discussion board seeking help from instructors and
other students after struggling for several hours to correct the
mistakes themselves. In fact, for the classroom version of the
Introduction to Programming course (6.00) taught at MIT, the teaching
assistants are required to manually go through each student submission
and provide qualitative feedback describing exactly what is wrong with
the submission and how to correct it. This manual feedback by teaching
assistants is prohibitively expensive for the number of students in
the online class setting.

The second approach, peer feedback, is being suggested as a potential
solution to this problem. In this approach, peer students who are
taking the same course answer the posts on the discussion boards, so
the problem of providing feedback is distributed across several peer
students. Unfortunately, providing quality feedback is a big challenge
even for experienced teaching assistants, and it is an even bigger
challenge for peer students who are themselves beginning to learn
programming. From the 6.00x discussion boards, we observed that in many
instances students had to wait several hours (or days) to get any
feedback, and in many cases the feedback provided was too general,
incomplete, or, in a few cases, wrong.

In this paper, we present an automated technique to provide feedback
for introductory programming assignments. The approach leverages
program synthesis technology to automatically determine minimal fixes
to the student's solution that will make it match the behavior of a
reference solution written by the instructor. This technology makes it
possible to provide students with precise feedback about what they did
wrong and how to correct it. The problem of providing automatic
feedback appears to be related to the problem of automated bug fixing,
but it differs from it in the following two significant respects:

• The complete specification is known. An important challenge 
in automatic debugging is that there is no way to know whether 
a fix is addressing the root cause of a problem, or simply 
masking it and potentially introducing new errors. Usually the 
best one can do is check a candidate fix against a test suite 
or a partial specification [9]. When providing feedback, on the
other hand, the solution to the problem is known, and it is safe
to assume that the instructor already wrote a correct reference
implementation for the problem.

• Errors are predictable. In a homework assignment, everyone 
is solving the same problem after having attended the same lec- 
tures, so errors tend to follow predictable patterns. This makes 
it possible to use a model-based feedback approach, where the 
potential fixes are guided by a model of the kinds of errors stu- 
dents typically make for a given problem. 

These simplifying assumptions, however, introduce their own set of
challenges. For example, since the complete specification is known,
the tool now needs to reason about the equivalence of the student
solution with the reference implementation. Also, in order to take
advantage of the predictability of errors, the tool needs to be
parameterized with models that describe the classes of errors.
Finally, these programs can be expected to have a higher density of
errors than production code, so techniques like the one suggested by
[18], which attempts to correct bugs one path at a time, will not work
for many of these problems, which require coordinated fixes in
multiple places.

Our automated feedback generation technique handles all of 
these challenges. The tool can reason about the semantic equiva- 
lence of student programs and reference implementations written 
in a fairly large subset of Python, so the instructor does not have
to learn a new formalism to write specifications. The tool also pro- 
vides an error model language that can be used to write an error 
model: a very high level description of potential corrections to er- 
rors that students might make in the solution. When the system en- 
counters an incorrect solution by a student, it symbolically explores 
the space of all possible combinations of corrections allowed by the 
error model and finds a correct solution requiring a minimal set of 
corrections. 

We have evaluated our approach on thousands of student solu- 
tions on programming problems obtained from the 6.00x submis- 
sions and discussion boards, and from the 6.00 class submissions. 
These problems constitute a major portion of the first month of
assignment problems. Our tool can successfully provide feedback on over
65% of the incorrect solutions.

This paper makes the following key contributions: 

• We show that the problem of providing automated feedback for
introductory programming assignments can be framed as a synthesis
problem. Our reduction uses a constraint-based mechanism to model
Python's dynamic typing and supports complex Python constructs such
as closures, higher-order functions, and list comprehensions.

• We define a high-level language EML that can be used to provide
correction rules for generating feedback. We also show that a small
set of such rules is sufficient to correct thousands of incorrect
solutions written by students.

• We report the successful evaluation of our technique on thousands
of real student attempts obtained from the 6.00 and 6.00x classes, as
well as from the Pex4Fun website. Our tool can provide feedback on 65%
of all submitted solutions that are incorrect in about 10 seconds on
average.



2. Overview of the approach 

In order to illustrate the key ideas behind our approach, consider
the problem of computing the derivative of a polynomial whose
coefficients are represented as a list of integers. This problem is
taken from the week 3 problem set of 6.00x (PS3: Derivatives). Given
the input list poly, the problem asks students to write the function
computeDeriv that computes a list poly' such that

$$
\text{poly}' =
\begin{cases}
\{\, i \times \text{poly}[i] \mid 0 < i < \text{len}(\text{poly}) \,\} & \text{if } \text{len}(\text{poly}) > 1 \\
[0.0] & \text{if } \text{len}(\text{poly}) = 1
\end{cases}
$$

For example, if the input list poly is [2, -3, 1, 4] (denoting
$f(x) = 4x^3 + x^2 - 3x + 2$), the computeDeriv function should return
[-3, 2, 12] (denoting the derivative $f'(x) = 12x^2 + 2x - 3$). The
reference implementation for the computeDeriv function is shown
in Figure 1. This problem teaches the concepts of conditionals and
iteration over lists. For this problem, students struggled with many
low-level Python semantics issues such as list indexing and
iteration bounds. In addition, they also struggled with conceptual
issues such as missing the corner case of handling lists consisting
of a single element (denoting a constant function).
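The specification above can be transcribed almost literally into Python. The following is a minimal sketch written for illustration, not the reference implementation of Figure 1; the name `compute_deriv_spec` is hypothetical.

```python
def compute_deriv_spec(poly):
    """Derivative of a polynomial given as a coefficient list, lowest degree first."""
    if len(poly) == 1:
        # A constant polynomial has the zero polynomial as its derivative.
        # Note that [0] compares equal to [0.0] in Python.
        return [0]
    # Coefficient poly[i] of x**i contributes i * poly[i] to the term x**(i - 1).
    return [i * poly[i] for i in range(1, len(poly))]


print(compute_deriv_spec([2, -3, 1, 4]))  # [-3, 2, 12]
```

The printed list matches the worked example in the text: 4x^3 + x^2 - 3x + 2 differentiates to 12x^2 + 2x - 3.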

    def computeDeriv_list_int(poly_list_int):
        result = []
        for i in range(len(poly_list_int)):
            result += [i * poly_list_int[i]]
        if len(poly_list_int) == 1:
            return result       # return [0]
        else:
            return result[1:]   # remove the leading 0

Figure 1. The reference implementation for the computeDeriv function.

A student solution for the computeDeriv problem taken from the 6.00x
discussion forum is shown in Figure 2(a). The student posted the code
in the forum seeking help and received two responses. The first
response asked the student to look at the return value of the first
if-block, and the second response said that the code should return [0]
instead of the empty list for the first if statement. There are many
different ways to modify the code to return [0] for the case
len(poly) = 1. The student chose to change the initialization of the
deriv variable from [] to the list [0]. The problem with this
modification is that the result will now have an additional 0 at the
front of the output list for all input lists (which is undesirable for
lists of length greater than 1). The student then posted another query
on the forum asking how to remove the leading 0 from the result, but
unfortunately this time did not get any further response.

Our tool generates the feedback shown in Figure 2(b) for the 
student program in about 40 seconds. During these 40 seconds, 
the tool searches over more than 10^ candidate fixes and finds the 
fix that requires the minimum number of corrections. There are three
problems with the student code: first, it should return [0] in line 5,
as was suggested in the forum (although the responses did not say how
to make the change); second, the if block in line 7 should be removed;
and third, the loop iteration in line 6 should start from index 1
instead of 0. The generated feedback consists of four pieces of information
(shown in bold in the figure for emphasis): 

• the location of the error denoted by the line number. 

• the problematic expression in the line. 

• the sub-expression which needs to be modified. 

• the new modified value of the sub-expression. 

The feedback generator is parameterized with a feedback-level
parameter to generate feedback consisting of different combinations
of these four pieces of information, depending on how much information
the instructor is willing to provide to the student.
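To make the difference between test-case feedback and the generated corrections concrete, the sketch below runs the student program of Figure 2(a) and a copy with the three reported changes applied against the reference behavior. The function names and the four test inputs are chosen here for illustration and are not taken from the 6.00x test suite.

```python
def reference(poly):
    # Reference implementation of Figure 1 (type suffixes dropped).
    result = []
    for i in range(len(poly)):
        result += [i * poly[i]]
    if len(poly) == 1:
        return result
    return result[1:]


def student(poly):
    # Student solution of Figure 2(a), unchanged.
    deriv = []
    zero = 0
    if len(poly) == 1:
        return deriv
    for expo in range(0, len(poly)):
        if poly[expo] == 0:
            zero += 1
        else:
            deriv.append(poly[expo] * expo)
    return deriv


def student_fixed(poly):
    # Same program with the three changes of Figure 2(b) applied.
    deriv = []
    zero = 0
    if len(poly) == 1:
        return [0]                        # line 5: deriv replaced by [0]
    for expo in range(1, len(poly)):      # line 6: start index 0 incremented by 1
        if False:                         # line 7: comparison replaced by False
            zero += 1
        else:
            deriv.append(poly[expo] * expo)
    return deriv


tests = [[7], [2, -3, 1, 4], [1, 0, 3], [0, 5]]
failing = [p for p in tests if student(p) != reference(p)]
still_failing = [p for p in tests if student_fixed(p) != reference(p)]
print(failing, still_failing)
```

Test-case feedback would only report the inputs in `failing`. Two of the four inputs pass by coincidence even though the program is wrong, which is part of why mapping failing tests back to errors is hard for beginners.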

2.1 Workflow 

In order to provide the level of feedback described above, the tool
needs some information from the instructor. First, the tool needs to
know what problem the students are supposed to solve. The instructor
provides this information by writing a reference implementation such
as the one in Figure 1. Since Python is dynamically typed, the
instructor also provides the types of the function arguments and
return value. In Figure 1, the instructor specifies the type of the
input argument to be a list of integers (poly_list_int) by appending
the type to the name.

In addition to the reference implementation, the tool needs a
description of the kinds of errors students might make. We have
designed an error model language, EML, which can describe a set
of correction rules that denote the potential corrections to errors
that students might make. For example, in the student attempt in
Figure 2(a), we observe that corrections often involve modifying
the return value and the range iteration values. We can specify this
information with the following three correction rules:

    return $a$ → return [0]
    range($a_1$, $a_2$) → range($a_1 + 1$, $a_2$)
    $a_0$ == $a_1$ → False

The correction rule return $a$ → return [0] states that the expression
of a return statement can be optionally replaced by [0]. The error
model for this problem that we use for our experiments is shown in
Figure 8, but we will use this simple error model to simplify the
presentation in this section. In later experiments, we also show how
only a few tens of incorrect solutions can provide enough information
to create an error model that can automatically provide feedback for
thousands of incorrect solutions.

     1  def computeDeriv(poly):
     2      deriv = []
     3      zero = 0
     4      if (len(poly) == 1):
     5          return deriv
     6      for expo in range(0, len(poly)):
     7          if (poly[expo] == 0):
     8              zero += 1
     9          else:
    10              deriv.append(poly[expo]*expo)
    11      return deriv

(a) Student's solution

The program requires 3 changes:

• In the return statement return deriv in line 5, replace deriv by [0].

• In the comparison expression (poly[expo] == 0) in line 7, change
(poly[expo] == 0) to False.

• In the expression range(0, len(poly)) in line 6, increment 0 by 1.

(b) Generated Feedback

Figure 2. (a) A student's computeDeriv solution from the 6.00x discussion board and (b) the feedback generated by our tool on this solution.

The tool now needs to explore the space of all candidate programs
obtained by applying these correction rules to the student program,
and to compute the candidate program that is equivalent to the
reference implementation and requires the minimum number of
corrections. We use constraint-based synthesis technology [11, 24,
27] to efficiently search over this large space of programs.
Specifically, we use the SKETCH synthesizer, which uses a SAT-based
algorithm to complete program sketches (programs with holes) so
that they meet a given specification. We extend the SKETCH
synthesizer with support for minimize hole expressions, whose values
are computed efficiently using incremental constraint solving. To
simplify the presentation, we use a simpler language mPy
(miniPython) in place of Python to explain the details of our
algorithm. In practice, our tool supports a fairly large subset of
Python including closures, higher-order functions, and list
comprehensions.

2.2 Solution Strategy 




Figure 3. The architecture of our automated feedback generation 
tool. 

The architecture of our tool is shown in Figure 3. The solution
strategy to find minimal corrections to a student's solution is
based on a two-phase translation to the SKETCH synthesis language.
In the first phase, the Program Rewriter uses the correction rules
to translate the solution into a language we call
$\widetilde{\text{mPy}}$; this language provides us with a concise
notation to describe sets of mPy candidate programs, together with a
cost model to reflect the number of corrections associated with each
program in this set. In the second phase, this
$\widetilde{\text{mPy}}$ program is translated into a SKETCH program
by the Sketch Translator.

     1  def computeDeriv(poly):
     2      deriv = []
     3      zero = 0
     4      if ({ |len(poly) == 1| , False }):
     5          return { |deriv| , [0] }
     6      for expo in range({ |0| , 1 }, len(poly)):
     7          if ({ |poly[expo] == 0| , False }):
     8              zero += 1
     9          else:
    10              deriv.append(poly[expo]*expo)
    11      return { |deriv| , [0] }

Figure 4. The resulting $\widetilde{\text{mPy}}$ program after applying
the correction rules to the program in Figure 2(a). The boxed
(zero-cost) choice in each set is written between vertical bars.

For the example in Figure 2(a), the Program Rewriter produces the
$\widetilde{\text{mPy}}$ program shown in Figure 4 using the
correction rules from Section 2.1. This program includes all the
possible corrections induced by the correction rules in the model.
The $\widetilde{\text{mPy}}$ language extends the imperative language
mPy with expression choices, where the choices are denoted with
squiggly brackets. Whenever there are multiple choices for an
expression or a statement, the zero-cost choice, the one that leaves
the expression unchanged, is boxed. For example, the expression choice
$\{\boxed{a_0}, a_1, \ldots, a_n\}$ denotes a choice between
expressions $a_0, \ldots, a_n$, where $a_0$ denotes the zero-cost
default choice.

For this simple program, the three correction rules induce a 
space of 32 different candidate programs. This candidate space is 
fairly small, but the number of candidate programs grows exponentially
with the number of correction places in the program and with
the number of correction choices in the rules. The error model that 
we use in our experiments induces a space of more than 10^^ can- 
didate programs for some of the benchmark problems. In order to 
search this large space efficiently, the program is translated to a 
sketch by the Sketch Translator. 
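For a space this small, the search can be illustrated by plain enumeration. The sketch below is not how the tool works (the tool uses the SKETCH synthesizer); it swaps in a brute-force loop over the five two-way choices of Figure 4 and a small, arbitrarily chosen set of bounded inputs. The function names, the flags `c4` to `c11`, and the input bounds are assumptions made for this illustration.

```python
from itertools import product


def reference(poly):
    # Behaviour of the reference implementation in Figure 1.
    result = [i * poly[i] for i in range(len(poly))]
    return result if len(poly) == 1 else result[1:]


def candidate(poly, c4, c5, c6, c7, c11):
    # Program of Figure 4; flag cN = True selects the non-default choice on line N.
    deriv = []
    zero = 0
    if (False if c4 else len(poly) == 1):
        return [0] if c5 else deriv
    for expo in range(1 if c6 else 0, len(poly)):
        if (False if c7 else poly[expo] == 0):
            zero += 1
        else:
            deriv.append(poly[expo] * expo)
    return [0] if c11 else deriv


def bounded_inputs(max_len=3, values=range(-2, 3)):
    # All non-empty integer lists up to max_len over a small value range.
    for n in range(1, max_len + 1):
        for poly in product(values, repeat=n):
            yield list(poly)


def minimal_correction():
    inputs = list(bounded_inputs())
    all_choices = list(product([False, True], repeat=5))
    correct = [c for c in all_choices
               if all(candidate(p, *c) == reference(p) for p in inputs)]
    best = min(correct, key=sum)  # fewest non-default choices
    return len(all_choices), best


space, best = minimal_correction()
print(space, best)
```

On these inputs the only candidate that agrees with the reference selects the non-default choices on lines 5, 6 and 7, which is the three-change feedback of Figure 2(b). Enumeration stops being feasible once the error model induces many more choices, which is the motivation for the symbolic search.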



    struct MultiType {
        int val, type;
        MTList lst;
    }

    struct MTList {
        int len;
        MultiType[len] lVals;
    }

Figure 5. The definition of the MultiType struct for encoding dynamic
types in Python.



2.3 Synthesizing Corrections with Sketch 

The SKETCH [24] synthesis system allows programmers to write
programs while leaving fragments of them unspecified as holes; the
contents of these holes are filled in automatically by the
synthesizer such that the program conforms to a specification provided
in terms of a reference implementation. The synthesizer uses the
CEGIS algorithm [25] to efficiently compute the values for holes
and uses bounded symbolic verification techniques to check the
equivalence of the two implementations.

There are two key aspects in the translation of a
$\widetilde{\text{mPy}}$ program to a SKETCH program. The first aspect
is specific to the Python language. SKETCH supports high-level
features such as closures and higher-order functions, which simplifies
the translation, but it is statically typed whereas mPy programs (like
Python programs) are dynamically typed. The translation models the
dynamically typed variables and operations over them using struct
types in SKETCH, in a way similar to union types. The second aspect of
the translation is the modeling of set-expressions in
$\widetilde{\text{mPy}}$ using ?? (holes) in SKETCH, which is language
independent.

The dynamic variable types in the mPy language are modeled
using the MultiType struct defined in Figure 5. The MultiType
struct consists of a type field that denotes the dynamic type
of a variable and currently supports the following set of types:
{INTEGER, BOOL, TYPE, LIST, TUPLE}. The val field stores the
integer value or the Boolean value of a variable, whereas the
lst field of type MTList stores the value of list and tuple
variables. The MTList struct consists of a field len that denotes the
length of the list and a field lVals of type array of MultiType
that stores the list elements. For example, the integer value 5 is
represented as the value MultiType(val=5, type=INTEGER) and
the list [1,2] is represented as the value MultiType(lst=new
MTList(len=2, lVals={new MultiType(val=1, type=INTEGER),
new MultiType(val=2, type=INTEGER)}), type=LIST).
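The encoding can be mimicked in Python with two small record types. This is only an illustrative analogue of the structs in Figure 5, restricted to integers, Booleans and lists; the helper functions `encode` and `decode` are hypothetical and are not part of the tool.

```python
from dataclasses import dataclass
from typing import List, Optional

INTEGER, BOOL, LIST = "INTEGER", "BOOL", "LIST"


@dataclass
class MTList:
    len: int
    lVals: List["MultiType"]


@dataclass
class MultiType:
    type: str                      # tag naming the dynamic type
    val: int = 0                   # payload for INTEGER and BOOL
    lst: Optional[MTList] = None   # payload for LIST


def encode(value):
    # bool is tested first because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return MultiType(type=BOOL, val=int(value))
    if isinstance(value, int):
        return MultiType(type=INTEGER, val=value)
    if isinstance(value, list):
        items = [encode(v) for v in value]
        return MultiType(type=LIST, lst=MTList(len=len(items), lVals=items))
    raise TypeError(f"unsupported value: {value!r}")


def decode(mt):
    if mt.type == INTEGER:
        return mt.val
    if mt.type == BOOL:
        return bool(mt.val)
    if mt.type == LIST:
        return [decode(item) for item in mt.lst.lVals]
    raise TypeError(f"unsupported tag: {mt.type}")


print(encode(5))
print(decode(encode([1, 2])))
```

Every operation on such values has to inspect the tag before touching a payload field, which is the price of modeling dynamic typing in a statically typed target language.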

The second key aspect of this translation is the translation of
expression choices in $\widetilde{\text{mPy}}$. The SKETCH construct
?? denotes an unknown integer hole that can be assigned any constant
integer value by the synthesizer. The expression choices in
$\widetilde{\text{mPy}}$ are translated to functions in SKETCH that,
based on the unknown hole values, return either the default expression
or one of the other expression choices. Each such function is
associated with a unique Boolean choice variable, which is set by the
function whenever it returns a non-default expression choice. For
example, the set-statement return $\{\boxed{\text{deriv}}, [0]\}$;
(line 5 in Figure 4) is translated to return modRetVal0(deriv), where
the modRetVal0 function is defined as:

    MultiType modRetVal0(MultiType a) {
        if (??) return a;           // default choice
        choiceRetVal0 = True;       // non-default choice
        MTList list = new MTList(lVals={new MultiType(val=0, type=INTEGER)}, len=1);
        return new MultiType(lst=list, type=LIST);
    }

The translation phase also generates a SKETCH harness that calls
the translated student and reference implementations and compares
their outputs on all inputs of a bounded size. For example, in the
case of the computeDeriv function, which takes a list as input, with
bounds of n = 4 for both the number of integer bits and the maximum
length of the input list, the harness makes sure that the outputs
of the two implementations match for more than $2^{16}$ different
inputs, as opposed to the 10 test cases used in 6.00x. The harness also
defines a totalCost variable as a function of all choice variables
that computes the total number of modifications performed in the
original program, and asserts that the value of totalCost should
be minimized. The synthesizer then solves this minimization problem
efficiently using an incremental solving algorithm, CEGISMIN,
described in Section 4.2.
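The size of the bounded input space can be counted directly. The sketch below assumes that lists of length 0 through 4 are included and that a 4-bit integer takes 16 distinct values in signed two's-complement form; both are assumptions made for this illustration, and the function names are hypothetical.

```python
from itertools import product


def count_inputs(bits=4, max_len=4):
    # Number of integer lists of length 0..max_len with `bits`-bit elements.
    values = 2 ** bits
    return sum(values ** n for n in range(max_len + 1))


def enumerate_inputs(bits, max_len):
    # Explicit enumeration, assuming signed two's-complement elements.
    lo, hi = -(2 ** (bits - 1)), 2 ** (bits - 1)
    for n in range(max_len + 1):
        for poly in product(range(lo, hi), repeat=n):
            yield list(poly)


print(count_inputs())                            # 69905
print(count_inputs() > 2 ** 16)                  # True
print(sum(1 for _ in enumerate_inputs(2, 2)))    # 21
```

Under these assumptions the harness covers 1 + 16 + 256 + 4096 + 65536 = 69905 inputs, a little over 2^16 = 65536.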

After the synthesizer finds a solution, the Feedback Generator 
uses the solution to the unknown integer holes in the sketch to 
compute the choices made by the synthesizer and generates the 
corresponding feedback. For this example, the tool generates the 
feedback shown in Figure 2(b) in less than 40 seconds. 

3. EML: Error Model Language

In this section, we describe the syntax and semantics of the error
model language EML. An EML error model consists of a set of
rewrite rules that capture the potential corrections for mistakes that
students might make in their solutions. We define the rewrite rules
over a simple Python-like imperative language mPy. A rewrite rule
transforms a program element in mPy to a set of weighted mPy
program elements. This weighted set of mPy program elements is
represented succinctly as a $\widetilde{\text{mPy}}$ program element,
where $\widetilde{\text{mPy}}$ extends the mPy language with set-exprs
(sets of expressions) and set-stmts (sets of statements). The weight
associated with a program element in this set denotes the cost of
performing the corresponding correction. An error model transforms an
mPy program to a $\widetilde{\text{mPy}}$ program (representing a set
of mPy programs) by recursively applying the rewrite rules. We show
that this transformation is deterministic and is guaranteed to
terminate on well-formed error models.

3.1 mPy and $\widetilde{\text{mPy}}$ languages

The syntax for the simple imperative language mPy is shown in
Figure 6(a) and the syntax of the $\widetilde{\text{mPy}}$ language is
shown in Figure 6(b). The purpose of the $\widetilde{\text{mPy}}$
language is to represent a large collection of mPy programs
succinctly. The $\widetilde{\text{mPy}}$ language consists of
set-expressions ($\tilde{a}$ and $\tilde{b}$) and set-statements
($\tilde{s}$) that represent weighted sets of the corresponding mPy
expressions and statements, respectively. For example, the set
expression $\{\boxed{n_0}, \ldots, n_k\}$ represents a weighted set of
constant integers where $n_0$ denotes the default integer value
associated with cost 0 and all other integer constants
($n_1, \ldots, n_k$) are associated with cost 1. The sets of composite
expressions are represented succinctly in terms of sets of their
constituent sub-expressions. For example, the composite expression
$\{\boxed{a_0}, a_0 + 1\}\;\{\boxed{<}, \le, >, \ge, ==, \ne\}\;\{\boxed{a_1}, a_1 + 1, a_1 - 1\}$
represents 36 mPy expressions.
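The count of 36 follows from multiplying the sizes of the three constituent sets (2 × 6 × 3). The sketch below checks this and also tallies the costs, reading the composite expression as a left operand set {a0, a0 + 1}, a set of six comparison operators, and a right operand set {a1, a1 + 1, a1 - 1}, with the first element of each set as the zero-cost default. The string representation and helper names are chosen here for illustration.

```python
from collections import Counter
from itertools import product


def weighted(default, *others):
    # The boxed default costs 0; every other choice costs 1.
    return [(default, 0)] + [(alt, 1) for alt in others]


def cross(*parts):
    # Cross-product of weighted sets; costs of the chosen parts add up.
    return [(" ".join(text for text, _ in combo),
             sum(cost for _, cost in combo))
            for combo in product(*parts)]


exprs = cross(
    weighted("a0", "a0 + 1"),
    weighted("<", "<=", ">", ">=", "==", "!="),
    weighted("a1", "a1 + 1", "a1 - 1"),
)
by_cost = Counter(cost for _, cost in exprs)
print(len(exprs), sorted(by_cost.items()))
```

Exactly one expression (`a0 < a1`) has cost 0; the remaining 35 are spread over costs 1 to 3 according to how many of the three positions deviate from their default.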

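Each mPy program in the set of programs represented by a
$\widetilde{\text{mPy}}$ program is associated with a cost (weight)
that encodes the number of modifications performed in the original
program to obtain the transformed program. This cost allows the tool
to search for corrections that require the minimum number of
modifications. The weighted set of mPy programs is defined using the
$[\![\,\cdot\,]\!]$ function shown partially in Figure 7; the complete
function definition can be found in [1]. The $[\![\,\cdot\,]\!]$
function on mPy expressions such as $a$ returns a singleton set
$\{(a, 0)\}$ consisting of the corresponding expression associated
with cost 0. On set-expressions of the form
$\{\boxed{\tilde{a}_0}, \ldots, \tilde{a}_n\}$, the function returns
the union of the weighted set of mPy expressions corresponding to the
default set-expression ($[\![\tilde{a}_0]\!]$) and the weighted sets of
expressions corresponding to the other set-expressions
($\tilde{a}_1, \ldots, \tilde{a}_n$), where each expression in
$[\![\tilde{a}_i]\!]$ is associated with an additional cost of 1. On
composite expressions, the function computes the weighted set
recursively by taking the cross-product of the weighted sets of its
constituent sub-expressions and adding their corresponding costs. For
example, the weighted set for the composite expression
$\tilde{x}[\tilde{y}]$ consists of an expression $x_i[y_j]$ associated
with cost $c_{x_i} + c_{y_j}$ for each
$(x_i, c_{x_i}) \in [\![\tilde{x}]\!]$ and
$(y_j, c_{y_j}) \in [\![\tilde{y}]\!]$.

$$
\begin{array}{lrcl}
\text{Arith Expr} & a & := & n \mid [\,] \mid v \mid a[a] \mid a_0\ op_a\ a_1 \\
 & & \mid & [a_1, \ldots, a_n] \mid f(a_0, \ldots, a_n) \\
 & & \mid & a_0 \text{ if } b \text{ else } a_1 \\
\text{Arith Op} & op_a & := & + \mid - \mid \times \mid / \mid {**} \\
\text{Bool Expr} & b & := & \text{not } b \mid a_0\ op_c\ a_1 \mid b_0\ op_b\ b_1 \\
\text{Comp Op} & op_c & := & == \mid < \mid > \mid \le \mid \ge \\
\text{Bool Op} & op_b & := & \text{and} \mid \text{or} \\
\text{Stmt Expr} & s & := & v = a \mid s_0; s_1 \mid \text{while } b : s \\
 & & \mid & \text{if } b : s_0 \text{ else}: s_1 \\
 & & \mid & \text{for } a_0 \text{ in } a_1 : s \mid \text{return } a \\
\text{Func Def.} & p & := & \text{def } f(a_1, \ldots, a_n) : s
\end{array}
$$

(a) mPy

$$
\begin{array}{lrcl}
\text{Arith set-expr} & \tilde{a} & := & a \mid \{\boxed{\tilde{a}_0}, \ldots, \tilde{a}_n\} \mid \tilde{a}[\tilde{a}] \mid \tilde{a}_0\ \widetilde{op}_a\ \tilde{a}_1 \\
 & & \mid & [\tilde{a}_0, \ldots, \tilde{a}_n] \mid f(\tilde{a}_0, \ldots, \tilde{a}_n) \\
\text{set-op} & \widetilde{op}_x & := & op_x \mid \{\boxed{\widetilde{op}_{x_0}}, \ldots, \widetilde{op}_{x_n}\} \\
\text{Bool set-expr} & \tilde{b} & := & b \mid \{\boxed{\tilde{b}_0}, \ldots, \tilde{b}_n\} \mid \text{not } \tilde{b} \mid \tilde{a}_0\ \widetilde{op}_c\ \tilde{a}_1 \mid \tilde{b}_0\ \widetilde{op}_b\ \tilde{b}_1 \\
\text{Stmt set-expr} & \tilde{s} & := & s \mid \{\boxed{\tilde{s}_0}, \ldots, \tilde{s}_n\} \mid v := \tilde{a} \mid \tilde{s}_0; \tilde{s}_1 \\
 & & \mid & \text{while } \tilde{b} : \tilde{s} \mid \text{for } \tilde{a}_0 \text{ in } \tilde{a}_1 : \tilde{s} \\
 & & \mid & \text{if } \tilde{b} : \tilde{s}_0 \text{ else}: \tilde{s}_1 \mid \text{return } \tilde{a} \\
\text{Func Def} & \tilde{p} & := & \text{def } f(a_1, \ldots, a_n) : \tilde{s}
\end{array}
$$

(b) $\widetilde{\text{mPy}}$

Figure 6. The syntax for (a) mPy and (b) $\widetilde{\text{mPy}}$ languages.

$$
\begin{array}{rcl}
[\![a]\!] & = & \{(a, 0)\} \\
[\![\{\boxed{\tilde{a}_0}, \ldots, \tilde{a}_n\}]\!] & = & [\![\tilde{a}_0]\!] \cup \{(a, c + 1) \mid (a, c) \in [\![\tilde{a}_i]\!],\ 0 < i \le n\} \\
[\![\tilde{a}_0[\tilde{a}_1]]\!] & = & \{(a_0[a_1],\ c_0 + c_1) \mid (a_i, c_i) \in [\![\tilde{a}_i]\!],\ i \in \{0, 1\}\} \\
[\![\text{while } \tilde{b} : \tilde{s}]\!] & = & \{(\text{while } b : s,\ c_b + c_s) \mid (b, c_b) \in [\![\tilde{b}]\!],\ (s, c_s) \in [\![\tilde{s}]\!]\}
\end{array}
$$

Figure 7. The $[\![\,\cdot\,]\!]$ function (shown partially) that
translates a $\widetilde{\text{mPy}}$ program to a weighted set of mPy
programs.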
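The three expression cases of the weighted-set function can be sketched as a short recursive procedure. The tuple encoding used here (`("choice", default, alt, ...)` for a set-expression and `("index", base, idx)` for a list access) and the name `weighted_set` are assumptions made for this illustration; statements such as while loops would follow the same cross-product pattern.

```python
def weighted_set(e):
    """Return the list of (expression text, cost) pairs denoted by e."""
    if isinstance(e, str):
        # Plain mPy expression: a singleton set with cost 0.
        return [(e, 0)]
    tag = e[0]
    if tag == "choice":
        _, default, *others = e
        # The default keeps its costs; every other alternative pays 1 extra.
        result = list(weighted_set(default))
        for alt in others:
            result += [(text, cost + 1) for text, cost in weighted_set(alt)]
        return result
    if tag == "index":
        _, base, idx = e
        # Composite expression: cross-product of the parts, costs added.
        return [(f"{b}[{i}]", cb + ci)
                for b, cb in weighted_set(base)
                for i, ci in weighted_set(idx)]
    raise ValueError(f"unknown node: {tag}")


expr = ("index", ("choice", "x", "y"), ("choice", "i", "i + 1", "i - 1"))
for text, cost in weighted_set(expr):
    print(cost, text)
```

For this input the function yields six list accesses: `x[i]` at cost 0, three at cost 1, and two at cost 2 where both the base and the index deviate from their defaults.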



3.2 Syntax of EML

An EML error model consists of a set of correction rules that are
used to transform an mPy program to a $\widetilde{\text{mPy}}$
program. A correction rule $C$ is written as a rewrite rule
$L \rightarrow R$, where $L$ and $R$ denote a program element in mPy
and $\widetilde{\text{mPy}}$, respectively. A program element can be a
term, an expression, a statement, a method, or the program itself. The
left-hand side ($L$) denotes an mPy program element that is pattern
matched to be transformed to a $\widetilde{\text{mPy}}$ program
element denoted by the right-hand side ($R$). The left-hand side of
the rule can use free variables, whereas the right-hand side can only
refer to the variables present in the left-hand side. The language
also supports a special ′ (prime) operator that can be used to tag
sub-expressions in $R$ that are further transformed recursively using
the error model. The rules use a shorthand notation $?a$ (in the
right-hand side) to denote the set of all variables that are of the
same type as the expression $a$ and are in scope at the corresponding
program location. We assume each correction rule is associated with
cost 1, but this can easily be extended to different costs to account
for different levels of mistakes.

Example 1. The error model for the computeDeriv problem is
shown in Figure 8. The IndR rewrite rule transforms the list access
indices. The InitR rule transforms the right-hand side of constant
initializations. The RanR rule transforms the arguments of the
range function; similar rules are defined in the model for the other
range functions that take one and three arguments. The CompR
rule transforms the operands and operator of comparisons.
The RetR rule adds the two common corner cases of returning [0]
when the length of the input list is 1, and of deleting the first
list element before returning the list. Note that these rewrite rules
define the corrections that can be performed optionally; the
zero-cost (default) case of not correcting a program element is added
automatically, as described in Section 3.3.

$$
\begin{array}{lrcl}
\text{IndR:} & v[a] & \rightarrow & v[\{a + 1,\ a - 1,\ ?a\}] \\
\text{InitR:} & v = n & \rightarrow & v = \{n + 1,\ n - 1,\ 0\} \\
\text{RanR:} & \text{range}(a_0, a_1) & \rightarrow & \text{range}(\{0,\ 1,\ a_0 - 1,\ a_0 + 1\},\ \{a_1 + 1,\ a_1 - 1\}) \\
\text{CompR:} & a_0\ op_c\ a_1 & \rightarrow & \{\{a_0' - 1,\ ?a_0\}\ \widetilde{op}_c\ \{a_1' - 1,\ 0,\ 1,\ ?a_1\},\ \text{True},\ \text{False}\} \\
 & & & \text{where } \widetilde{op}_c = \{<, >, \le, \ge, ==, \ne\} \\
\text{RetR:} & \text{return } a & \rightarrow & \text{return } \{[0] \text{ if len}(a) == 1 \text{ else } a,\ a[1:] \text{ if } (\text{len}(a) > 1) \text{ else } a\}
\end{array}
$$

Figure 8. The error model $\mathcal{E}$ for the computeDeriv problem.

Definition 1. Well-formed Rewrite Rule: A rewrite rule
$C : L \rightarrow R$ is defined to be well-formed if all tagged
sub-terms $t'$ in $R$ have a smaller syntax tree than that of $L$.

The rewrite rule $C_1 : v[a] \rightarrow \{(v[a])' + 1\}$ is not a
well-formed rewrite rule, as the size of the tagged sub-term ($v[a]$)
of $R$ is the same as that of the left-hand side $L$. On the other
hand, the rewrite rule $C_2 : v[a] \rightarrow \{v'[a'] + 1\}$ is
well-formed.

Definition 2. Well-formed Error Model: An error model $\mathcal{E}$ is
defined to be well-formed if all of its constituent rewrite rules
$C_i \in \mathcal{E}$ are well-formed.
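The well-formedness condition is a purely syntactic size check and can be sketched in a few lines. The tuple encoding of terms below, with `("prime", t)` marking a tagged sub-term, and all function names are assumptions made for this illustration.

```python
def size(t):
    # Number of nodes in the syntax tree; the prime marker itself is not a node.
    if isinstance(t, str):
        return 1
    if t[0] == "prime":
        return size(t[1])
    return 1 + sum(size(child) for child in t[1:])


def tagged_subterms(t):
    # Yield every sub-term of t that carries the prime tag.
    if isinstance(t, str):
        return
    if t[0] == "prime":
        yield t[1]
        return
    for child in t[1:]:
        yield from tagged_subterms(child)


def is_well_formed(lhs, rhs):
    # Definition 1: every tagged sub-term of R is strictly smaller than L.
    return all(size(s) < size(lhs) for s in tagged_subterms(rhs))


def is_well_formed_model(rules):
    # Definition 2: every rule of the error model is well-formed.
    return all(is_well_formed(lhs, rhs) for lhs, rhs in rules)


L = ("index", "v", "a")                                            # v[a]
R1 = ("set", ("+", ("prime", ("index", "v", "a")), "1"))           # {(v[a])' + 1}
R2 = ("set", ("+", ("index", ("prime", "v"), ("prime", "a")), "1"))  # {v'[a'] + 1}

print(is_well_formed(L, R1), is_well_formed(L, R2))
```

The rule with right-hand side `R1` is rejected because its tagged sub-term is as large as the left-hand side, so recursive transformation would revisit a term of the same size; `R2` only tags the strictly smaller sub-terms `v` and `a`.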

3.3 Transformation with EML

An error model $\mathcal{E}$ is syntactically translated to a function
$T_{\mathcal{E}}$ that transforms an mPy program to a
$\widetilde{\text{mPy}}$ program. The $T_{\mathcal{E}}$ function
first traverses the program element $w$ in the default way, i.e., no
transformation happens at this level of the syntax tree, and the
method is called recursively on all of its top-level sub-terms $t$ to
obtain the transformed element $w_0 \in \widetilde{\text{mPy}}$. For
each correction rule $C_i : L_i \rightarrow R_i$ in the error model
$\mathcal{E}$, the method contains a Match expression that matches the
term $w$ with the left-hand side of the rule $L_i$ (with appropriate
unification of the free variables in $L_i$). If the match succeeds,
$w$ is transformed to a term $w_i \in \widetilde{\text{mPy}}$ as
defined by the right-hand side $R_i$ of the rule, after applying the
$T_{\mathcal{E}}$ method recursively on each one of its tagged
sub-terms $t'$. Finally, the method returns the set of all transformed
terms $\{\boxed{w_0}, \ldots, w_n\}$.

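Example 2. Consider an error model $\mathcal{E}_1$ consisting of the
following three correction rules:

$$
\begin{array}{lrcl}
C_1: & v[a] & \rightarrow & v[\{a - 1,\ a + 1\}] \\
C_2: & a_0\ op_c\ a_1 & \rightarrow & \{a_0' - 1,\ 0\}\ op_c\ \{a_1' - 1,\ 0\} \\
C_3: & v[a] & \rightarrow & ?v[a]
\end{array}
$$

The transformation function $T_{\mathcal{E}_1}$ for the error model
$\mathcal{E}_1$ is shown in Figure 9.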

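$$
\begin{array}{l}
T_{\mathcal{E}_1}(w : \text{mPy}) : \widetilde{\text{mPy}} = \\
\quad \text{let } w_0 = w[t \rightarrow T_{\mathcal{E}_1}(t)] \text{ in} \qquad (*\ t : \text{a sub-term of } w\ *) \\
\quad \text{let } w_1 = \text{Match } w \text{ with} \\
\qquad v[a] \rightarrow v[\{a + 1,\ a - 1\}] \text{ in} \\
\quad \text{let } w_2 = \text{Match } w \text{ with} \\
\qquad a_0\ op_c\ a_1 \rightarrow \{T_{\mathcal{E}_1}(a_0) - 1,\ 0\}\ op_c\ \{T_{\mathcal{E}_1}(a_1) - 1,\ 0\} \text{ in} \\
\quad \{\boxed{w_0},\ w_1,\ w_2\}
\end{array}
$$

Figure 9. The $T_{\mathcal{E}_1}$ method for error model $\mathcal{E}_1$.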

The recursive steps of applying the $T_{\mathcal{E}_1}$ function to the
expression (x[i] < y[j]) are shown in Figure 10. This example illustrates
two interesting features of the transformation function:

• Nested Transformations: Once a rewrite rule $L \to R$ is applied to
transform a program element matching $L$ to $R$, the instructor may want
to apply another rewrite rule on only a few sub-terms of $R$. For
example, she may want to avoid transforming the sub-terms which have
already been transformed by some other correction rule. The Eml language
facilitates making such a distinction between the sub-terms for
performing nested corrections using the ′ (prime) operator. Only the
sub-terms in $R$ that are tagged with the prime operator are visited for
applying further transformations (using the $T_{\mathcal{E}}$ method
recursively on its tagged sub-terms $t'$), whereas the remaining
non-tagged sub-terms are not transformed any further. After applying the
rewrite rule $C_2$ in the example, the sub-terms x[i] and y[j] are
further transformed by applying rewrite rules $C_1$ and $C_3$.

• Ambiguous Transformations: While transforming a program using an error
model, it may happen that multiple rewrite rules pattern match the
program element $w$. After applying rewrite rule $C_2$ in the example,
there are two rewrite rules $C_1$ and $C_3$ that pattern match the terms
x[i] and y[j]. After applying one of these rules ($C_1$ or $C_3$) to an
expression v[a], we cannot apply the other rule to the transformed
expression. In such ambiguous cases, the $T_{\mathcal{E}}$ function
creates a separate copy of the transformed program element ($w_i$) for
each ambiguous choice and then performs the set union of all such
elements to obtain the transformed program element. This semantics of
handling ambiguity of rewrite rules also matches naturally with the
intent of the instructor. If the instructor wanted to perform both
transformations together on array accesses, she could have provided a
combined rewrite rule such as $v[a] \to\ ?v[\{a + 1, a - 1\}]$.
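
The behavior of the transformation function on Example 2 can be sketched
in Python. This is a minimal illustration and not the tool's
implementation: the tuple encoding of terms, the helper names T and
expand, and the assumption that ?v ranges over the other array variables
in scope are all choices made for this sketch. T builds the
set-expression for the three rules (default choice first), and expand
enumerates the concrete candidate expressions that it denotes.

```python
from itertools import product

# Terms are nested tuples (an encoding assumed for this sketch only):
#   ("var", name), ("const", k), ("index", v, a), ("cmp", op, a0, a1),
#   ("add", a, b), ("sub", a, b), ("anyvar", v) for ?v,
#   ("set", c0, c1, ...) for a set-expression; a default choice, if any, is c0.

def var(name):
    return ("var", name)

def const(k):
    return ("const", k)

def T(w):
    """Set-expression of rewrites of w under rules C1, C2, C3 of Example 2."""
    kind = w[0]
    if kind == "index":
        _, v, a = w
        w0 = ("index", T(v), T(a))                     # default traversal
        # C1: v[a] -> v[{a-1, a+1}]; no primes, so v and a are not revisited
        w1 = ("index", v, ("set", ("sub", a, const(1)), ("add", a, const(1))))
        # C3: v[a] -> ?v[a]
        w3 = ("index", ("anyvar", v), a)
        return ("set", w0, w1, w3)
    if kind == "cmp":
        _, op, a0, a1 = w
        w0 = ("cmp", op, T(a0), T(a1))                 # default traversal
        # C2: a0 op a1 -> {a0' - 1, 0} op {a1' - 1, 0}; primes mean "recurse"
        w2 = ("cmp", op,
              ("set", ("sub", T(a0), const(1)), const(0)),
              ("set", ("sub", T(a1), const(1)), const(0)))
        return ("set", w0, w2)
    return w  # variables and constants match no rule

def expand(w, arrays):
    """List the concrete terms denoted by a term containing set-expressions."""
    kind = w[0]
    if kind in ("var", "const"):
        return [w]
    if kind == "anyvar":
        # ?v: every array variable other than v (an assumption of this sketch)
        return [var(n) for n in arrays if n != w[1][1]]
    if kind == "set":
        return [t for choice in w[1:] for t in expand(choice, arrays)]
    parts = [expand(c, arrays) if isinstance(c, tuple) else [c] for c in w[1:]]
    return [(kind, *combo) for combo in product(*parts)]

expr = ("cmp", "<", ("index", var("x"), var("i")), ("index", var("y"), var("j")))
candidates = expand(T(expr), arrays=["x", "y"])
```

Under these assumptions, each array access expands to 4 alternatives, so
the default comparison contributes 4 × 4 candidates and the C2 rewrite
contributes 5 × 5, for 41 candidates in total; the first one is the
unmodified expression.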

Theorem 1. Given a well-formed error model $\mathcal{E}$, the
transformation function $T_{\mathcal{E}}$ always terminates.

Proof. From the definition of a well-formed error model, each of its
constituent rewrite rules is also well-formed. Hence, each application
of a rewrite rule reduces the size of the syntax tree of terms that are
required to be visited further for transformation by $T_{\mathcal{E}}$.
Therefore, the $T_{\mathcal{E}}$ function terminates in a finite number
of steps. □

4. Constraint-based Solving of mPy programs 

In the previous section, we saw the transformation of an mPy program to
an $\widetilde{\text{mPy}}$ program based on an error model. We now
present the translation of $\widetilde{\text{mPy}}$ programs into SKETCH
programs [24].

4.1 Translation of mPy programs to Sketch 

The mPy programs are translated to SKETCH programs to perform
constraint-based analysis. The main aspects of the translation are the
translation of: (i) Python-like constructs in mPy to SKETCH, and
(ii) set-expr choices in mPy to SKETCH functions.

Handling dynamic typing of mPy variables The dynamic typing of mPy is
handled using a MultiType variable as described in Section 2.3. The mPy
expressions and statements are transformed to SKETCH functions that
perform the corresponding transformations over MultiType. For example,
the Python statement (a = b) is translated to assignMT(a, b), where the
assignMT function assigns MultiType b to a. Similarly, the binary add
expression (a + b) is translated to binOpMT(a, b, ADD_OP) that in turn
calls the function addMT(a, b) to add a and b as shown in Figure 11.

    MultiType addMT(MultiType a, MultiType b){
      assert a.flag == b.flag; // same types can be added
      if(a.flag == INTEGER) // add for integers
        return new MultiType(val=a.val+b.val, flag=INTEGER);
      if(a.flag == LIST){ // add for lists
        int newLen = a.lst.len + b.lst.len;
        MultiType[newLen] newLVals = a.lst.lVals;
        for(int i=0; i<b.lst.len; i++)
          newLVals[i+a.lst.len] = b.lst.lVals[i];
        return new MultiType(lst = new MTList(lVals=newLVals, len=newLen),
                             flag=LIST);}
      ...
    }

Figure 11. The addMT function for adding two MultiType a and b.
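
For readers more familiar with Python than with SKETCH, the idea behind
addMT can be mirrored as follows. This is only an analogy under stated
assumptions: the dataclass layout, the name add_mt, and the use of a
plain Python list in place of the MTList record are hypothetical and do
not reflect the tool's library.

```python
from dataclasses import dataclass
from typing import List, Optional

INTEGER, LIST = 0, 1  # type flags, mirroring the constants used in Figure 11

@dataclass
class MultiType:
    flag: int                                # dynamic type tag
    val: int = 0                             # payload when flag == INTEGER
    lst: Optional[List["MultiType"]] = None  # payload when flag == LIST

def add_mt(a: MultiType, b: MultiType) -> MultiType:
    """Dynamically typed '+': integer addition or list concatenation."""
    assert a.flag == b.flag, "only values of the same type can be added"
    if a.flag == INTEGER:
        return MultiType(flag=INTEGER, val=a.val + b.val)
    if a.flag == LIST:
        return MultiType(flag=LIST, lst=list(a.lst) + list(b.lst))
    raise TypeError("type flag not handled in this sketch")

two = MultiType(flag=INTEGER, val=2)
three = MultiType(flag=INTEGER, val=3)
five = add_mt(two, three)
pair = add_mt(MultiType(flag=LIST, lst=[two]), MultiType(flag=LIST, lst=[three]))
```

The assertion plays the same role as in the SKETCH version: a student
program that adds an integer to a list fails the check rather than
silently producing a value.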

Translation of mPy set-expressions The set-expressions in mPy are
translated to SKETCH functions. The function bodies obtained from the
translation ($\Phi$) of some of the interesting mPy constructs are shown
in Figure 12. The SKETCH construct ?? (called a hole) is a placeholder
for a constant value, which is filled in by the SKETCH synthesizer while
solving the constraints to satisfy the given specification.

The singleton sets consisting of an mPy expression such as
$\{\tilde{a}\}$ are translated simply to the corresponding expression
itself.



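$T(x[i] < y[j]) = \{\ \boxed{T(x[i]) < T(y[j])},\ \{T(x[i]) - 1, 0\} < \{T(y[j]) - 1, 0\}\ \}$

$T(x[i]) = \{\ \boxed{T(x)[T(i)]},\ x[\{i + 1, i - 1\}],\ y[i]\ \}$

$T(y[j]) = \{\ \boxed{T(y)[T(j)]},\ y[\{j + 1, j - 1\}],\ x[j]\ \}$

Therefore, after substitution the result is:

$T(x[i] < y[j]) = \{\ \boxed{\{\boxed{x[i]},\ x[\{i + 1, i - 1\}],\ y[i]\} < \{\boxed{y[j]},\ y[\{j + 1, j - 1\}],\ x[j]\}},$
$\qquad \{\{\boxed{x[i]},\ x[\{i + 1, i - 1\}],\ y[i]\} - 1,\ 0\} < \{\{\boxed{y[j]},\ y[\{j + 1, j - 1\}],\ x[j]\} - 1,\ 0\}\ \}$

Figure 10. Application of $T_{\mathcal{E}_1}$ (abbreviated $T$) on expression (x[i] < y[j]).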



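    $\Phi(\{\tilde{a}\})$ = $\tilde{a}$
    $\Phi(\{\boxed{\tilde{a}_0}, \cdots, \tilde{a}_n\})$ = if (??) $\Phi(\tilde{a}_0)$ else $\Phi(\{\tilde{a}_1, \cdots, \tilde{a}_n\})$
    $\Phi(\{\tilde{a}_0, \cdots, \tilde{a}_n\})$ = if (??) {choice_k = 1; $\Phi(\tilde{a}_0)$}
                                                   else $\Phi(\{\tilde{a}_1, \cdots, \tilde{a}_n\})$
    $\Phi(\tilde{a}_0[\tilde{a}_1])$ = $\Phi(\tilde{a}_0)[\Phi(\tilde{a}_1)]$
    $\Phi(\tilde{a}_0 = \tilde{a}_1)$ = $\Phi(\tilde{a}_0) := \Phi(\tilde{a}_1)$

Figure 12. The translation rules (shown partially) for converting
mPy set-exprs to corresponding SKETCH function bodies.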



A set-expression of the form $\{\boxed{\tilde{a}_0}, \cdots, \tilde{a}_n\}$
is translated recursively to the if expression: if (??)
$\Phi(\tilde{a}_0)$ else $\Phi(\{\tilde{a}_1, \cdots, \tilde{a}_n\})$,
which means that the synthesizer can optionally select the default
set-expression $\Phi(\tilde{a}_0)$ (by choosing ?? to be true) or select
one of the other choices ($\tilde{a}_1, \cdots, \tilde{a}_n$). The
set-expressions of the form $\{\tilde{a}_0, \cdots, \tilde{a}_n\}$ are
similarly translated but with an additional statement for setting a fresh
variable choice_k if the synthesizer selects the non-default choice
$\tilde{a}_0$.

The translation rule for assignment statements
($\tilde{a}_0 := \tilde{a}_1$) results in if expressions on both the left
and right sides of the assignment. The if expression choices occurring on
the left hand side are desugared to individual assignments. For example,
the left hand side expression if (??) x else y := 10 is desugared to
if (??) x := 10 else y := 10. The infix operators in mPy are first
translated to function calls and are then translated to SKETCH using the
translation for set-function expressions. The remaining mPy expressions
are similarly translated recursively as shown in the figure.
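
The recursive translation of set-expressions into nested if (??)
expressions can be illustrated with a small Python sketch that emits
SKETCH-like text. The SetExpr class, the translate function, and the
choice0, choice1, ... naming scheme are hypothetical; the sketch follows
the partially shown rules literally, so a singleton set is emitted as the
expression itself, with no choice variable.

```python
from itertools import count

class SetExpr:
    """A set-expression; boxed_default marks choices[0] as the default choice."""
    def __init__(self, choices, boxed_default=False):
        self.choices = list(choices)
        self.boxed_default = boxed_default

def translate(e, fresh):
    """Return SKETCH-like text for e; fresh yields indices for choice variables."""
    if isinstance(e, str):           # a plain expression: emitted unchanged
        return e
    first, rest = e.choices[0], e.choices[1:]
    if not rest:                     # singleton set: the expression itself
        return translate(first, fresh)
    if e.boxed_default:              # default choice: no cost is recorded
        then = translate(first, fresh)
    else:                            # non-default choice: set a fresh choice_k
        k = next(fresh)
        then = "{choice%d = 1; %s}" % (k, translate(first, fresh))
    return "if (??) %s else %s" % (then, translate(SetExpr(rest), fresh))

text = translate(SetExpr(["a", "a - 1", "a + 1"], boxed_default=True), count())
```

Each hole ?? is an independent unknown for the synthesizer, and each
choice variable set along the selected branch later contributes to the
cost that is minimized.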

Translating function calls The translation of function calls for
recursive problems and for problems that require writing a function that
uses other sub-functions is parameterized by three options: 1) use the
student's implementation of sub-functions, 2) use the teacher's
implementation of sub-functions, and 3) treat the sub-functions as
uninterpreted functions.

Generating the driver functions The SKETCH synthesizer supports checking
equivalence of functions whose input arguments and return values are over
SKETCH primitive types such as int, bit and arrays. Therefore, after the
translation of mPy programs to SKETCH programs, we need additional driver
functions to connect functions using MultiType input arguments and return
values to corresponding functions over SKETCH primitive types. The driver
functions first convert the input arguments over primitive types to
corresponding MultiType variables using library functions such as
computeMTFromInt, and then call the translated mPy function with the
MultiType variables. The returned MultiType value is translated back to
primitive types using library functions such as computeIntFromMT. The
driver function for the student's program also contains additional
statements of the form if(choice_k) totalCost++; and the statement
minimize(totalCost), which tells the synthesizer to compute a solution to
the Boolean variables choice_k that minimizes the totalCost variable.

4.2 Incremental Solving for the Minimize hole expressions

Algorithm 1 CEGISMIN Algorithm for Minimize expression

    $\sigma_0 \leftarrow \sigma_{random}$, $\Phi_0 \leftarrow \Phi$, $\phi_p \leftarrow$ null
    while (True)
        $\Phi_i \leftarrow$ Synth($\sigma_{i-1}, \Phi_{i-1}$)        ▷ Synthesis Phase
        if ($\Phi_i$ = UNSAT)                                          ▷ Synthesis Fails
            if ($\phi_p$ = null) return UNSAT_SKETCH
            else return PE($P, \phi_p$)
        choose $\phi \in \Phi_i$
        $\sigma_i \leftarrow$ Verify($\phi$)                           ▷ Verification Phase
        if ($\sigma_i$ = null)                                         ▷ Verification Succeeds
            (minHole, minHoleVal) $\leftarrow$ getMinHoleValue($\phi$)
            $\phi_p \leftarrow \phi$
            $\Phi_i \leftarrow \Phi_i \cup$ {encode(minHole < minHoleVal)}



We extend the CEGIS algorithm in SKETCH [24] to get the CEGISMIN
algorithm shown in Algorithm 1 for efficiently solving sketches that
include a minimize hole expression. The input state of the sketch program
is denoted by $\sigma$, whereas the sketch constraint store is denoted by
$\Phi$. Initially, the input state $\sigma_0$ is assigned a random input
state value and the constraint store $\Phi_0$ is assigned the constraint
set obtained from the sketch program. The variable $\phi_p$ stores the
previous satisfiable hole values and is initialized to null. In each
iteration of the loop, the synthesizer first performs the inductive
synthesis phase where it shrinks the constraint set $\Phi_{i-1}$ to
$\Phi_i$ by removing behaviors from $\Phi_{i-1}$ that do not conform to
the input state $\sigma_{i-1}$. If the constraint set becomes
unsatisfiable, it returns the sketch completed with hole values from the
previous solution if one exists; otherwise it returns UNSAT. On the other
hand, if the constraint set is satisfiable, then it first chooses a
conforming assignment to the hole values and goes into the verification
phase where it tries to verify the completed sketch. If the verifier
fails, it returns a counter-example input state $\sigma_i$ and the
synthesis-verification loop is repeated. If the verification phase
succeeds, instead of returning the result as is done in the CEGIS
algorithm, the CEGISMIN algorithm computes the value of minHole from the
constraint set $\Phi_i$, stores the current satisfiable hole solution
$\phi$ in $\phi_p$, and adds an additional constraint
{minHole < minHoleVal} to the constraint set $\Phi_i$. The
synthesis-verification loop is then repeated with this additional
constraint to find a conforming value for the minHole variable that is
smaller than the current value in $\phi_p$.

4.3 Mapping SKETCH solution to generate feedback 

Each correction rule in the error model $\mathcal{E}$ is associated with
a feedback message, e.g. the integer variable initialization correction
rule $v = n \to v = \{n + 1\}$ in the computeDeriv error model is
associated with the message "Increment the right hand side of the
initialization by 1". After the SKETCH synthesizer finds a solution to
the constraints, the tool maps back the values of unknown integer holes
to their corresponding expression choices. These expression choices are
then mapped to natural language feedback using the messages associated
with the corresponding correction rules, together with the line numbers.
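
A minimal sketch of this mapping step is given below. The Correction
record, the feedback function, the choice variable names, the second
message, and the exact wording of the report are hypothetical; only the
first message is taken from the text above. The sketch assumes the
synthesizer's solution is available as a mapping from choice variables to
Booleans.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Correction:
    line: int      # line number in the student's program
    message: str   # message attached to the correction rule that was applied

def feedback(solution, corrections):
    """Turn the selected choice variables into a natural language report."""
    chosen = [corrections[name] for name, on in solution.items() if on]
    chosen.sort(key=lambda c: c.line)
    report = ["The program requires %d change(s):" % len(chosen)]
    report += ["  In line %d: %s" % (c.line, c.message) for c in chosen]
    return "\n".join(report)

corrections = {
    "choice0": Correction(2, "Increment the right hand side of the initialization by 1"),
    "choice1": Correction(5, "Change the comparison operator"),  # hypothetical message
}
solution = {"choice0": True, "choice1": False}  # as if returned by the synthesizer
report = feedback(solution, corrections)
```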

5. Implementation and Experiments 

We now briefly describe some of the implementation details of the 
tool, and then describe the experiments we performed with it. 

5.1 Implementation and Features 

The tool's frontend, which converts a Python program to a SKETCH program,
is implemented in Python itself and uses the Python ast module for
parsing and rewriting ASTs. The backend system that solves the SKETCH
program and provides feedback is implemented as a wrapper over the SKETCH
system extended with the CEGISMIN algorithm. Error models in our tool are
currently written in Python in terms of the Python AST. The tool also
provides a mechanism to assign different cost measures to correction
rules in the error model to account for different levels of mistakes.
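
As an illustration of what rewriting in terms of the Python AST can look
like, the following sketch uses the standard ast module to apply a single
initialization rule, v = n to v = n + 1, directly to a program. It is not
the tool's frontend: the class name IncrementInit and the sample program
are hypothetical, the rewrite is applied unconditionally instead of being
recorded as one choice in a set-expression, and ast.unparse requires
Python 3.9 or later.

```python
import ast

class IncrementInit(ast.NodeTransformer):
    """Rewrite `v = n` (n an integer literal) into `v = n + 1`."""

    def visit_Assign(self, node):
        self.generic_visit(node)  # visit children first
        single_name = (len(node.targets) == 1
                       and isinstance(node.targets[0], ast.Name))
        int_literal = (isinstance(node.value, ast.Constant)
                       and type(node.value.value) is int)  # excludes bool
        if single_name and int_literal:
            node.value = ast.Constant(value=node.value.value + 1)
        return node

src = (
    "def f(xs):\n"
    "    total = 0\n"
    "    for x in xs:\n"
    "        total = total + x\n"
    "    return total\n"
)
tree = IncrementInit().visit(ast.parse(src))
ast.fix_missing_locations(tree)   # fill in positions for the new node
new_src = ast.unparse(tree)       # back to source text (Python 3.9+)
```

Only the initialization is rewritten; the assignment inside the loop is
left alone because its right hand side is not an integer literal.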

5.2 Benchmarks 

We created our benchmark set with problems taken from the Introduction
to Programming course at MIT (6.00) and the EdX version of the class
(6.00x). The prodBySum problem asks for a product computed using the sum
operator, the oddTuples problem asks for a list consisting of alternate
elements of the input list, the evalPoly problem asks for the value of a
polynomial at a given value, the iterPower (and recurPower) problem asks
for exponentiation computed using multiplication, and the iterGCD problem
asks for the gcd of two numbers. We also created a few AP-level
loop-over-arrays and dynamic programming problems on Pex4Fun to show the
scalability and applicability of our technique to other languages such
as C#.

5.3 Experiments 

Performance Table 1 shows the number of student attempts corrected for
each benchmark problem as well as the time taken by the tool to provide
the feedback. The experiments were performed on a 2.4GHz Intel Xeon CPU
with 16 cores and 16GB RAM, with bounds of 4 bits for input integer
values and a maximum length of 4 for input lists. For each benchmark
problem, we first removed the student attempts with syntax errors to get
the Test Set on which we ran our tool. We then separated out the attempts
which were correct to measure the effectiveness of the tool on the
incorrect attempts. The tool was able to provide appropriate corrections
as feedback for 65% of all incorrect student attempts in around 10
seconds on average. The remaining 35% of incorrect student attempts on
which the tool could not provide feedback fall in one of the following
categories:

• Completely incorrect solutions: We observed many student attempts that
were empty or performed trivial computations such as printing variables.

• Big conceptual errors: A common error we found in the case of
evalPoly-6.00x was that a large fraction of incorrect attempts (260/541)
used the list function index to get the index of a value in the list,
whereas the index function returns the index of the first occurrence of
the value in the list. Some other similar mistakes involve introducing or
moving program statements from one place to another. These mistakes
cannot be corrected with the application of a set of local correction
rules.

• Unimplemented features: Our implementation currently lacks a few of
the complex Python features such as pattern matching on the list
enumerate function.

• Timeout: In our experiments, fewer than 1% of the student attempts
timed out (the timeout was set to 2 minutes).
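
The index-related error in the list above can be reproduced in a few
lines of Python. The function names and the example polynomial are
hypothetical and chosen only for illustration; the sketch assumes that
coefficients are stored lowest degree first.

```python
def eval_poly(poly, x):
    """Evaluate a polynomial given as a coefficient list, lowest degree first."""
    return sum(c * x ** i for i, c in enumerate(poly))

def eval_poly_with_index(poly, x):
    """Buggy variant: list.index returns the FIRST position of a value, so the
    exponent is wrong whenever a coefficient occurs more than once."""
    return sum(c * x ** poly.index(c) for c in poly)

poly = [2.0, 3.0, 2.0]   # 2 + 3x + 2x^2: the coefficient 2.0 is repeated
correct = eval_poly(poly, 2.0)
buggy = eval_poly_with_index(poly, 2.0)
```

For this polynomial the two functions disagree at x = 2 (16.0 versus
10.0), while they agree on any polynomial whose coefficients are all
distinct. Repairing the buggy variant means replacing the way the
exponent is obtained, not adjusting an expression in place, which is why
local correction rules do not cover it.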

Number of Corrections The number of student attempts that require
different numbers of corrections is shown in Figure 13(a). We observe
from the figure that a large fraction of the attempts require 3 or 4
coordinated corrections, and to provide feedback on such attempts, we
need a technology like ours that can symbolically encode the outcome of
different corrections on all input values.

Repetitive Mistakes In this experiment, we validate our hypothe- 
sis that students make similar mistakes while solving a given prob- 
lem. The graph in Figure 13(b) shows the number of student at- 
tempts corrected as more rules are added to the error model of the 
benchmark problems. As can be seen in the figure, adding a sin- 
gle rule to the error model can lead to correction of hundreds of 
attempts. 

Generalization of Error Models In this experiment, we check the
hypothesis that the correction rules generalize across problems of a
similar kind. The result of running the computeDeriv error model on other
benchmark problems is shown in Figure 13(c). As expected, it does not
perform as well as the problem-specific error models, but it does suffice
to fix a fraction of the incorrect attempts and also provides a good
starting point to add more problem-specific rules.

6. Related Work 

In this section, we describe work related to our technique from the
areas of automated programming tutors, automated program repair, fault
localization, automated debugging and program synthesis.

6.1 AI based programming tutors 

There has been a lot of work done in the AI community for building 
automated tutors for helping novice programmers learn program- 
ming by providing feedback about semantic errors. These tutoring 
systems can be categorized into the following two major classes: 

Code-based matching approaches: LAURA [2] converts the teacher's and
student's programs into a graph-based representation and compares them
heuristically by applying program transformations while reporting
mismatches as potential bugs. TALUS [22] matches a student's attempt with
a collection of teacher's algorithms. It first tries to recognize the
algorithm used and then tentatively replaces the top-level expressions in
the student's attempt with the recognized algorithm for generating
correction feedback. The problem with these approaches is that the
enumeration of all possible algorithms (with their variants) for covering
all corrections is very large and tedious on the part of the teacher.

Figure 13. (a) The number of incorrect attempts that require different
numbers of corrections, (b) the number of attempts corrected as rules are
added to the error models, and (c) the performance of the computeDeriv
error model on other problems.

Benchmark             Total     Syntax  Test Set  Correct  Incorrect  Generated      Average      Median
                      Attempts  Errors                     Attempts   Feedback       Time (in s)  Time (in s)
prodBySum-6.00        1056      16      1040      772      268        218 (81.3%)    2.49s        2.53s
oddTuples-6.00        2386      1040    1346      1002     344        185 (53.8%)    2.65s        2.54s
computeDeriv-6.00     144       20      124       21       103        88 (85.4%)     12.95s       4.9s
evalPoly-6.00         144       23      121       108      13         6 (46.1%)      3.35s        3.01s
computeDeriv-6.00x    4146      1134    3012      2094     918        753 (82.1%)    12.42s       6.32s
evalPoly-6.00x        4698      1004    3694      3153     541        167 (30.9%)    4.78s        4.19s
oddTuples-6.00x       10985     5047    5938      4182     1756       860 (48.9%)    4.14s        3.77s
iterPower-6.00x       8982      3792    5190      2315     2875       1693 (58.9%)   3.58s        3.46s
recurPower-6.00x      8879      3395    5484      2546     2938       2271 (77.3%)   10.59s       5.88s
iterGCD-6.00x         6934      3732    3202      214      2988       2052 (68.7%)   17.13s       9.52s
stock-market-I (C#)   52        11      41        19       22         16 (72.3%)     7.54s        5.23s
stock-market-II (C#)  51        8       43        19       24         14 (58.3%)     11.16s       10.28s
restaurant rush (C#)  124       38      86        20       66         41 (62.1%)     8.78s        8.19s

Table 1. The percentage of student attempts corrected and the time taken
for correction for the benchmark problems.

Intention-based matching approaches: LISP tutor [8] creates a model of
the student goals and updates it dynamically as the student makes edits.
The drawback of this approach is that it forces students to write code in
a certain pre-defined structure and limits their freedom. MENO-II [26]
parses student programs into a deep syntax tree whose nodes are annotated
with plan tags. This annotated tree is then matched with the plans
obtained from the teacher's solution. PROUST [17], on the other hand,
uses a knowledge base of goals and their corresponding plans for
implementing them for each programming problem. It first tries to find
correspondence of these plans in the student's code and then performs
matching to find discrepancies. CHIRON [23] is an improved version of
PROUST in which the goals and plans in the knowledge base are organized
in a hierarchical manner based on their generality, and which uses
machine learning techniques for plan identification in the student code.
These approaches require the teacher to provide all possible plans a
student can use to solve the goals of a given problem and do not perform
well if the student's attempt uses a plan not present in the knowledge
base.

Our approach checks semantic equivalence of the student's attempt and the
teacher's solution based on exhaustive bounded symbolic verification
techniques and makes no assumptions on the algorithms or plans that
students can use for solving the problem. Moreover, our approach is
modular with respect to error models; the local correction rules are
provided in a declarative manner and their complex interactions are
handled by the solver itself.

6.2 Automated Program Repair 

Könighofer et al. [20] present an approach for automated error
localization and correction of imperative programs. They use model-based
diagnosis to localize components that need to be replaced and then use a
template-based approach for providing corrections using SMT reasoning.
Their fault model only considers the right hand side (RHS) of assignment
statements as replaceable components. The approaches in [16, 28] frame
the problem of program repair as a game between an environment that
provides the inputs and a system that provides correct values for the
buggy expressions such that the specification is satisfied. These
approaches only support simple corrections (e.g. correcting the RHS of
expressions) in the fault model as they aim to repair large programs with
arbitrary errors. In our setting, we exploit the fact that we have access
to the dataset of previous student mistakes that we can use to construct
a concise and precise error model. This enables us to model more
sophisticated transformations, such as introducing new program statements
and replacing the LHS of assignments, in our error model. Unlike the
earlier mentioned approaches, our approach also supports minimal cost
changes to student's programs, where each error in the model is
associated with a certain cost.



Mutation-based program repair [6] performs mutations repeat- 
edly to statements in a buggy program in order of their suspicious- 
ness until the program becomes correct. The large state space of 
mutants (10^"^) makes this approach infeasible. Our approach uses 
a symbolic search for exploring correct solutions over this large set. 
There are also some genetic programming approaches that exploit 
redundancy present in other parts of the code for fixing faults [3, 9]. 
These techniques are not applicable in our setting as such redun- 
dancy is not present in introductory programming problems. 

6.3 Automated Debugging and Fault localization 

Techniques like Delta Debugging [30] and QuickXplain [19] aim to simplify
a failing test case to a minimal test case that still exhibits the same
failure. Our approach can be complemented with these techniques to
restrict the application of rewrite rules to certain failing parts of the
program only. There are many algorithms for fault localization [4, 10]
that use the difference between faulty and successful executions of the
system to identify potential faulty locations. Jose et al. [18] recently
suggested an approach that uses a MAX-SAT solver to satisfy the maximum
number of clauses in a formula obtained from a failing test case to
compute potential error locations. These approaches, however, only
localize faults for a single failing test case and the suggested error
location might not be the desired error location, since we are looking
for common error locations that cause failure of multiple test cases.
Moreover, these techniques provide only a limited set of suggestions (if
any) for repairing these faults.

6.4 Program Synthesis

Program synthesis has been used recently for many applications such as
synthesis of efficient low-level code [21, 25], inference of efficient
synchronization in concurrent programs [29], snippets of Excel macros
[13], relational data structures [14, 15] and angelic programming [5].
The SKETCH tool [24, 25] takes a partial program and a reference
implementation as input and uses constraint-based reasoning to synthesize
a complete program that is equivalent to the reference implementation. In
general, the template of the desired program as well as the reference
specification is unknown, which puts an additional burden on the users to
provide them; in our case we use the student's solution as the template
program and the teacher's solution as the reference implementation. A
recent work by Gulwani et al. [12] also uses program synthesis techniques
for automatically synthesizing solutions to ruler/compass based geometry
construction problems. Their focus is primarily on finding a solution to
a given geometry problem whereas we aim to provide feedback on a given
programming exercise solution.

7. Conclusions 

In this paper, we presented a new technique for automatically providing
feedback for introductory programming assignments that can complement
manual and test-case-based techniques. The technique uses an error model
describing the potential corrections and constraint-based synthesis to
compute minimal corrections to students' incorrect solutions. We have
evaluated our technique on a large set of benchmarks and it can correct
65% of incorrect solutions. We believe this technique can provide a basis
for providing automated feedback to hundreds of thousands of students
learning from online introductory programming courses that are being
taught by MITx and Udacity.